Abstract



We study the thermodynamic behavior of a model protein with 54 amino acids that 
is designed to form a three-helix bundle in its native state. The model contains three 
types of amino acids and five to six atoms per amino acid, and has the Ramachandran 
torsion angles as its only degrees of freedom. The force field is based on hydrogen 
bonds and effective hydrophobicity forces. We study how the character of the collapse 
transition depends on the strengths of these forces. For a suitable choice of these two 
parameters, it is found that the collapse transition is first-order-like and coincides 
with the folding transition. Also shown is that the corresponding one- and two-helix 
segments make less stable secondary structure than the three-helix sequence. 

Keywords: protein folding, folding thermodynamics, hydrogen bonds, hydrophobicity
1 Introduction



The study of the formation of the native structures of proteins is hampered by
computational limitations and uncertainties about the relevant forces, which makes
model building a delicate and highly relevant task. Most current models use one or
both of two quite drastic approximations, the lattice and Go [1] approximations, where
the latter amounts to ignoring interactions that do not favor the desired structure. 
Models of these types have provided valuable insights into the physical principles of 
protein folding [2]-[6], but have their obvious limitations.



Besides being computationally convenient, lattice models have the important advan- 
tage that it is known what is needed in order for stable and fast-folding chains to 
exist; it can be achieved by using a simple contact potential. For off-lattice mod- 
els this is largely unknown, although one thing that seems clear is that it is not 
enough to simply use a potential analogous to the contact potential [7]-[9]. Because
of this uncertainty, and because of evidence that the native structure in itself is a 
major determinant of folding kinetics [10], many off-lattice studies have been based
on Go-type potentials. 

In this paper, we take a different approach, by discussing an off-lattice model,
proposed in [11], that does not use the Go approximation. In this model, the formation
of a native structure is driven by hydrogen bonding and effective hydrophobicity
forces. The model has three types of amino acids and the Ramachandran angles $\phi_i$
and $\psi_i$ [12] as its degrees of freedom. Each amino acid is represented by five or
six atoms.

Using this model, we study a three-helix-bundle protein with 54 amino acids, which
represents a truncated and simplified version of the four-helix-bundle protein de novo
designed by Regan and DeGrado [13]. To study size dependence, we also look at the
behavior of the corresponding one- and two-helix segments. The thermodynamic
properties of these different chains are explored by using the method of simulated
tempering [14]-[16].



Two key parameters of this model are the respective strengths $\epsilon_{\rm hb}$ and
$\epsilon_{\rm AA}$ of the hydrogen bonds and hydrophobicity forces. For a suitable
choice of these parameters, to be denoted by
$(\tilde\epsilon_{\rm hb}, \tilde\epsilon_{\rm AA})$, the three-helix sequence is found
to have the following properties [11]:



• It does form a stable three-helix bundle (except for a twofold topological
degeneracy).

• It undergoes a first-order-like folding transition, from an expanded state to the
native three-helix-bundle state.

• It forms more stable secondary structure than the one- and two-helix segments.

Figure 1: Schematic figure showing the representation of one amino acid.

Qualitatively similar results have been obtained previously for $C_\alpha$
[6],[17]-[20] and all-atom [21] off-lattice chains, but, as far as we know, only with
Go-type potentials.

The paper is organized as follows. In Section 2, we give a brief description of the
model. Our results are presented in Section 3. Here, we first summarize the results
obtained in [11] for
$(\epsilon_{\rm hb}, \epsilon_{\rm AA}) = (\tilde\epsilon_{\rm hb}, \tilde\epsilon_{\rm AA})$.
We then discuss how the character of the collapse transition depends on the relative
strength of $\epsilon_{\rm hb}$ and $\epsilon_{\rm AA}$, by studying the behavior for
$(\epsilon_{\rm hb}, \epsilon_{\rm AA}) = (\tilde\epsilon_{\rm hb} - \kappa, \tilde\epsilon_{\rm AA} + \kappa)$
for different $\kappa$. We end with a brief summary in Section 4.



2 The Model 



The model we study is a reduced off-lattice model. Figure 1 illustrates the
representation of one amino acid. The side chain is represented by a single atom,
$C_\beta$, which can be either hydrophobic, polar or absent. This gives us three types
of amino acids: A with hydrophobic $C_\beta$, B with polar $C_\beta$, and G (glycine)
without $C_\beta$.

The H, O and $C_\beta$ atoms are all attached to the backbone in a rigid way.
Furthermore, in the backbone, all bond lengths, bond angles and peptide torsion angles
(180°) are held fixed. This leaves us with two degrees of freedom per amino acid, the
Ramachandran torsion angles $\phi_i$ and $\psi_i$ (see Figure 1).

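Our energy function

$$E = E_{\rm loc} + E_{\rm sa} + E_{\rm hb} + E_{\rm AA} \tag{1}$$

is composed of four terms. The local potential $E_{\rm loc}$ has a standard form with
threefold symmetry,

$$E_{\rm loc} = \frac{\epsilon_\phi}{2} \sum_i (1 + \cos 3\phi_i) + \frac{\epsilon_\psi}{2} \sum_i (1 + \cos 3\psi_i) . \tag{2}$$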

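The self-avoidance term $E_{\rm sa}$ is given by a hard-sphere potential of the form

$$E_{\rm sa} = \epsilon_{\rm sa} {\sum_{i<j}}' \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} , \tag{3}$$

where the sum runs over all possible atom pairs except those consisting of two
hydrophobic $C_\beta$. The hydrogen-bond term $E_{\rm hb}$ is given by

$$E_{\rm hb} = \epsilon_{\rm hb} \sum_{ij} u(r_{ij})\, v(\alpha_{ij}, \beta_{ij}) , \tag{4}$$

where

$$v(\alpha_{ij}, \beta_{ij}) = \begin{cases} \cos^2\alpha_{ij} \cos^2\beta_{ij} & \alpha_{ij}, \beta_{ij} > 90^\circ \\ 0 & \text{otherwise} \end{cases} \tag{6}$$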

Here, $i$ and $j$ represent H and O atoms, respectively, and $r_{ij}$ denotes the HO
distance, $\alpha_{ij}$ the NHO angle, and $\beta_{ij}$ the HOC angle. The last term in
Equation (1) is the hydrophobicity energy $E_{\rm AA}$, which has the form

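$$E_{\rm AA} = \epsilon_{\rm AA} \sum_{i<j} \left[ \left( \frac{\sigma_{\rm AA}}{r_{ij}} \right)^{12} - 2 \left( \frac{\sigma_{\rm AA}}{r_{ij}} \right)^{6} \right] , \tag{7}$$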

where both $i$ and $j$ represent hydrophobic $C_\beta$. To speed up the simulations, a
cutoff radius $r_c$ is used (footnote 2), which is 4.5 Å for $E_{\rm sa}$ and
$E_{\rm hb}$, and 8 Å for $E_{\rm AA}$.
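The angular factor of Equation (6) is simple enough to write out directly. The following is a minimal sketch, not the authors' implementation; the function name and the choice of degrees as input units are assumptions made for illustration.

```python
import math

def hbond_angular_factor(alpha_deg: float, beta_deg: float) -> float:
    """Angular factor v(alpha, beta) of Equation (6).

    alpha_deg is the NHO angle and beta_deg the HOC angle, both in degrees
    (the unit convention is an assumption of this sketch).
    Returns cos^2(alpha) * cos^2(beta) if both angles exceed 90 degrees,
    and 0 otherwise.
    """
    if alpha_deg > 90.0 and beta_deg > 90.0:
        cos_a = math.cos(math.radians(alpha_deg))
        cos_b = math.cos(math.radians(beta_deg))
        return (cos_a * cos_a) * (cos_b * cos_b)
    return 0.0

# A perfectly linear arrangement receives the full weight.
v_linear = hbond_angular_factor(180.0, 180.0)
# A bent geometry is down-weighted.
v_bent = hbond_angular_factor(150.0, 150.0)
# If either angle is 90 degrees or less, the bond does not contribute.
v_off = hbond_angular_factor(80.0, 170.0)
```

The factor favors linear hydrogen bonds and switches the interaction off entirely once either angle drops to 90° or below.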

The parameters of the energy function were determined empirically based on the
shape of the Ramachandran $\phi_i, \psi_i$ distribution and on the overall
thermodynamic behavior of the three-helix-bundle protein. A complete list of energy and
geometry parameters can be found in [11]. In Figure 2, we show the final
$\phi_i, \psi_i$ distributions for non-glycine (A and B) and glycine.

Footnote 2: The cutoff procedure is $f(r) \mapsto \tilde f(r)$, where
$\tilde f(r) = f(r) - f(r_c) - (r - r_c) f'(r_c)$ if $r < r_c$ and $\tilde f(r) = 0$
otherwise.

Figure 2: $\phi_i, \psi_i$ scatter plots for non-glycine and glycine, as obtained by
simulations of the chains GXG for X=A/B and X=G, respectively, at $kT = 0.625$ (shown
is $\phi_i, \psi_i$ for X).

1H    BBABBAABBABBAABB
2H    1H-GGG-1H
3H    1H-GGG-1H-GGG-1H

Table 1: The sequences studied.
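The cutoff procedure of footnote 2 subtracts both the value and the slope of the pair function at $r_c$, so that the truncated function and its first derivative vanish continuously at the cutoff. Below is a minimal sketch under stated assumptions: the pair function is taken to be the Lennard-Jones-like form $(\sigma/r)^{12} - 2(\sigma/r)^{6}$, the value `SIGMA_AA = 5.0` is a hypothetical placeholder rather than a parameter quoted in this paper, and all function names are illustrative.

```python
def lj_like(r, sigma):
    """Pair function (sigma/r)^12 - 2 (sigma/r)^6 (assumed form)."""
    x6 = (sigma / r) ** 6
    return x6 * x6 - 2.0 * x6

def lj_like_deriv(r, sigma):
    """Analytic derivative of lj_like with respect to r."""
    x6 = (sigma / r) ** 6
    return (-12.0 * x6 * x6 + 12.0 * x6) / r

def with_cutoff(f, dfdr, r_c):
    """Return f~(r) = f(r) - f(r_c) - (r - r_c) f'(r_c) for r < r_c, else 0."""
    f_c = f(r_c)
    df_c = dfdr(r_c)

    def f_tilde(r):
        if r < r_c:
            return f(r) - f_c - (r - r_c) * df_c
        return 0.0

    return f_tilde

SIGMA_AA = 5.0   # hypothetical value in Angstrom, for illustration only
R_CUT_AA = 8.0   # cutoff radius for the hydrophobicity term, as stated in the text

pair_energy = with_cutoff(
    lambda r: lj_like(r, SIGMA_AA),
    lambda r: lj_like_deriv(r, SIGMA_AA),
    R_CUT_AA,
)
```

Because both the value and the slope are matched at $r_c$, the truncated potential has no jump in energy or force at the cutoff, which matters for gradient-based minimization.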



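As mentioned in the introduction, we study the model for different
$(\epsilon_{\rm hb}, \epsilon_{\rm AA})$. For
$(\epsilon_{\rm hb}, \epsilon_{\rm AA}) = (\tilde\epsilon_{\rm hb}, \tilde\epsilon_{\rm AA}) = (2.8, 2.2)$
(dimensionless units), it turns out that

$$\tilde\epsilon_{\rm hb}/kT_f \approx 4.3 \qquad \tilde\epsilon_{\rm AA}/kT_f \approx 3.4 , \tag{8}$$

where $T_f$ denotes the folding temperature of the three-helix-bundle protein (see
below).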
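As a quick consistency check, the two ratios in Equation (8) can be inverted to give the folding temperature in the model's dimensionless units. The variable names below are illustrative only.

```python
# Parameter values and ratios as quoted in the text.
eps_hb = 2.8
eps_aa = 2.2
ratio_hb = 4.3   # eps_hb / kT_f
ratio_aa = 3.4   # eps_aa / kT_f

# Each ratio gives an independent estimate of kT_f.
kT_f_from_hb = eps_hb / ratio_hb
kT_f_from_aa = eps_aa / ratio_aa
```

Both estimates come out at about 0.65, so the two rounded ratios are mutually consistent.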

The three sequences studied are listed in Table 1. They contain 16, 35 and 54 amino
acids, respectively. The one-helix segment 1H consists of A and B amino acids that 
are distributed in such a way that this segment can form a helix with all hydrophobic 
amino acids on the same side. The three-helix sequence, 3H, consists of three such 
stretches of As and Bs plus two GGG segments. 
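The construction of the sequences can be checked in a few lines. This sketch builds the chains from the one-helix segment and the GGG linker and confirms the quoted lengths; the helper name `build_sequence` is an illustrative assumption.

```python
ONE_HELIX = "BBABBAABBABBAABB"  # the 1H segment
LINKER = "GGG"                  # glycine linker between helices

def build_sequence(n_helices: int) -> str:
    """Join n_helices copies of the 1H segment with GGG linkers."""
    return LINKER.join([ONE_HELIX] * n_helices)

sequences = {f"{n}H": build_sequence(n) for n in (1, 2, 3)}
lengths = {name: len(seq) for name, seq in sequences.items()}
```

The resulting lengths are 16, 35 and 54 amino acids for 1H, 2H and 3H.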



There have been several earlier studies of similar-sized helical proteins using models
at comparable levels of resolution [18],[22]-[27]. Among these studies, most similar
to ours is that of Takada et al. [27]. These authors studied the same sequences, using
a somewhat similar chain representation and a different, more elaborate force field.
Figure 3: The specific heat $C_v = (\langle E^2 \rangle - \langle E \rangle^2)/NkT^2$
against temperature for the sequences 1H ($\circ$), 2H ($\times$) and 3H (+) (see
Table 1), for (a) $\kappa = 0$ and (b) $\kappa = 0.3$ ($N$ denotes the number of amino
acids). The full lines represent single-histogram extrapolations [28]. Dotted lines are
drawn to guide the eye.



3 Results 



Using simulated tempering, we study the thermodynamic behavior of the chains
defined above for

$$\epsilon_{\rm hb} = \tilde\epsilon_{\rm hb} - \kappa \qquad \epsilon_{\rm AA} = \tilde\epsilon_{\rm AA} + \kappa \tag{9}$$

for different $\kappa$.



3.1 Balance between Hydrogen Bonds and Hydrophobicity 
Forces 



We begin with a summary of the results obtained in [11] for $\kappa = 0$.

For this choice of $(\epsilon_{\rm hb}, \epsilon_{\rm AA})$, it turns out that the
three-helix sequence exhibits an abrupt collapse transition, signaled by a sharp peak
in the specific heat. This can be seen from Figures 3a and 4, which show the specific
heat and radius of gyration, respectively, against temperature.
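The specific heat is obtained from energy fluctuations, $C_v = (\langle E^2 \rangle - \langle E \rangle^2)/NkT^2$, and a single-histogram extrapolation amounts to reweighting energies sampled at one temperature to a nearby one. The sketch below illustrates both steps. It uses units with $k = 1$ and a toy energy list that is purely hypothetical, and it is not claimed to reproduce the exact procedure used for the figures.

```python
import math

def specific_heat(energies, n_res, kT):
    """C_v per amino acid in units of k: (<E^2> - <E>^2) / (N (kT)^2)."""
    n = len(energies)
    mean_e = sum(energies) / n
    mean_e2 = sum(e * e for e in energies) / n
    return (mean_e2 - mean_e ** 2) / (n_res * kT ** 2)

def reweighted_specific_heat(energies, n_res, kT_sim, kT_new):
    """Estimate C_v at kT_new from energies sampled at kT_sim.

    Each sample gets a weight exp(-(1/kT_new - 1/kT_sim) * E); the largest
    log-weight is subtracted first to avoid overflow.
    """
    dbeta = 1.0 / kT_new - 1.0 / kT_sim
    log_w = [-dbeta * e for e in energies]
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    norm = sum(weights)
    mean_e = sum(w * e for w, e in zip(weights, energies)) / norm
    mean_e2 = sum(w * e * e for w, e in zip(weights, energies)) / norm
    return (mean_e2 - mean_e ** 2) / (n_res * kT_new ** 2)

# Toy input (hypothetical numbers, not simulation data).
toy_energies = [1.0, 3.0]
c_direct = specific_heat(toy_energies, n_res=2, kT=1.0)
c_same = reweighted_specific_heat(toy_energies, n_res=2, kT_sim=1.0, kT_new=1.0)
```

Reweighting is reliable only close to the simulated temperature, where the sampled energy histogram still overlaps the target distribution.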

It is instructive to look at how the results depend on chain length near the transition. 
Two important observations are the following: 



• The peak in the specific heat gets stronger with increasing chain length. The
increase in height is not inconsistent with a linear size dependence, which is
what one would expect at a conventional first-order phase transition with a
latent heat.

• The decrease in hydrogen-bond energy per amino acid, $E_{\rm hb}/N$, with decreasing
temperature gets more rapid with increasing chain length, as shown in Figure 5a.
This implies that the three-helix protein makes more stable secondary
structure than the one- and two-helix segments.

Figure 4: Radius of gyration (in Å) against temperature for the three-helix
sequence, for $\kappa = 0$ ($\circ$) and $\kappa = 0.3$ (+).

Figure 5: Hydrogen-bond energy per amino acid, $E_{\rm hb}/N$, against temperature for
the sequences 1H ($\circ$), 2H ($\times$) and 3H (+) (see Table 1), for (a)
$\kappa = 0$ and (b) $\kappa = 0.3$.



It turns out that the sequence 3H does form a three-helix bundle at low temperatures.
This bundle can have two distinct topologies; if we let the first two helices form a
U, then the third helix can be either in front of or behind that U. The model is,
not unexpectedly, unable to discriminate between these two possibilities. To
characterize low-temperature conformations, we therefore determined two representative
structures, one for each topology, which, following [27], are referred to as FU and BU,
respectively. These structures are shown in Figure 6. Given an arbitrary conformation,
we then measure the root-mean-square deviations $\delta_i$ ($i$ = FU, BU) from these
two structures (calculated over all backbone atoms). These deviations are converted
into similarity parameters $Q_i$ by using

$$Q_i = \exp(-\delta_i^2 / 100\,\text{Å}^2) . \tag{10}$$

Figure 6: Representative low-temperature structures, FU and BU, respectively.
Drawn with RasMol [29].

Figure 7: (a) $Q_{\rm FU}$, $Q_{\rm BU}$ scatter plot (see Equation (10)) at the
collapse temperature. (b) Free-energy profile $F(Q)$ at the same temperature.



At high temperatures, both $Q_i$ tend to be small. At low temperatures, the system
spends most of its time close to one or the other of the structures FU and BU; either
$Q_{\rm FU}$ or $Q_{\rm BU}$ is close to 1. Finally, at the collapse temperature, all
three of these regions are populated, as can be seen from Figure 7a. In particular,
this implies that folding and collapse occur at the same temperature.
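The mapping of Equation (10) between a root-mean-square deviation $\delta$ (in Å) and the similarity parameter $Q$ is easy to evaluate in both directions. The sketch below is illustrative; the function names are assumptions, and `nativeness` implements $Q = \max(Q_{\rm FU}, Q_{\rm BU})$.

```python
import math

LENGTH_SCALE_SQ = 100.0  # Angstrom^2, the constant in Equation (10)

def similarity(delta: float) -> float:
    """Q = exp(-delta^2 / 100 A^2) for an rmsd delta given in Angstrom."""
    return math.exp(-delta ** 2 / LENGTH_SCALE_SQ)

def rmsd_from_similarity(q: float) -> float:
    """Inverse mapping: the rmsd (in Angstrom) that corresponds to a given Q."""
    return math.sqrt(-LENGTH_SCALE_SQ * math.log(q))

def nativeness(delta_fu: float, delta_bu: float) -> float:
    """Q = max(Q_FU, Q_BU), i.e. similarity to the closer reference structure."""
    return max(similarity(delta_fu), similarity(delta_bu))

q_at_3 = similarity(3.0)                # about 0.91
delta_at_07 = rmsd_from_similarity(0.7)  # about 6 Angstrom
delta_at_02 = rmsd_from_similarity(0.2)  # about 13 Angstrom
```

Since $Q$ decreases monotonically with $\delta$, taking the maximum of the two $Q_i$ is the same as taking the minimum of the two deviations.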

In Figure 7b, we show the free-energy profile $F(Q)$ at the folding temperature, where
$Q = \max(Q_{\rm FU}, Q_{\rm BU})$ is taken as a measure of "nativeness". The free
energy has a relatively sharp minimum at $Q \approx 0.9$, corresponding to
$\delta = \min(\delta_{\rm FU}, \delta_{\rm BU}) \approx 3$ Å. This is followed by a
weak barrier around $Q = 0.7$, corresponding to $\delta \approx 6$ Å. Finally, there is
a broad minimum at small $Q$, where $Q = 0.2$ corresponds to $\delta \approx 13$ Å. In
[11] it was shown that the low-$Q$ minimum corresponds to expanded structures with a
varying secondary-structure content.
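A free-energy profile of this kind is commonly estimated from a histogram of sampled $Q$ values as $F(Q) = -kT \ln P(Q)$, up to an additive constant. The sketch below assumes that definition, which is not spelled out in the text; the bin count and the sample list are hypothetical placeholders, not simulation data.

```python
import math

def free_energy_profile(q_samples, kT, n_bins=10):
    """Return [(bin_center, F)] with F = -kT ln P(Q), shifted so min(F) = 0.

    Q is assumed to lie in [0, 1]. Empty bins get F = inf.
    """
    counts = [0] * n_bins
    for q in q_samples:
        idx = min(int(q * n_bins), n_bins - 1)  # Q = 1 goes into the last bin
        counts[idx] += 1
    total = len(q_samples)
    profile = []
    for i, c in enumerate(counts):
        center = (i + 0.5) / n_bins
        f = -kT * math.log(c / total) if c > 0 else math.inf
        profile.append((center, f))
    f_min = min(f for _, f in profile)
    return [(q, f - f_min) for q, f in profile]

# Hypothetical two-state sample: 20 expanded and 80 native-like conformations.
toy_q = [0.15] * 20 + [0.95] * 80
toy_profile = free_energy_profile(toy_q, kT=0.65)
```

In this toy case the free-energy difference between the two populated bins is $kT \ln 4$, reflecting the 80:20 population ratio.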

In particular, these results show that the three-helix sequence exhibits a
first-order-like collapse transition that coincides with its folding transition. This
is the behavior for $\kappa = 0$ (see Equation (9)). Next we discuss the character of
the collapse transition for $\kappa \neq 0$, starting with positive $\kappa$.



3.2 Dominant Hydrophobicity Forces 



A positive $\kappa$ means strong hydrophobicity forces and weak hydrogen bonds. For
small positive $\kappa$, the collapse temperature remains approximately the same as for
$\kappa = 0$. However, the transition gets weaker with increasing $\kappa$. This is
illustrated in Figures 3b and 4, using data obtained for $\kappa = 0.3$.

In Figure 3b, we show the specific heat for $\kappa = 0.3$. Compared to the
$\kappa = 0$ results (see Figure 3a), we see that the peak in the specific heat is
lower, and that the chain-length dependence is weaker. There is no sign that the chain
collapse is first-order-like for $\kappa = 0.3$.

Consistent with the data for the specific heat, we see from Figure 4 that the radius
of gyration changes more slowly with temperature for $\kappa = 0.3$ than for
$\kappa = 0$.

It is also interesting to look at the secondary-structure content. From Figure 5, it
can be seen that the hydrogen-bond energy $E_{\rm hb}$ is considerably higher for
$\kappa = 0.3$ than for $\kappa = 0$. In particular, the results show that the
secondary-structure content at the collapse temperature is lower for $\kappa = 0.3$.

For $\kappa = 0.3$, we furthermore find that the three-helix sequence does not show
structural stability at temperatures immediately below the collapse transition (data
not shown), so the folding temperature is different from and lower than the collapse
temperature in this case. Between these two temperatures, the chain exists in a compact
(molten globule) state without specific structure.



Figure 8: Three-helix-bundle (full line) and one-helix (dashed line) energies against
$\kappa$ (see the text).



3.3 Dominant Hydrogen Bonds 



We now turn to negative $\kappa$, meaning strong hydrogen bonds and weak hydrophobicity
forces. It is clear that the three-helix sequence will form one long helix rather than
a helical bundle if $\kappa$ is made too large and negative. To get an idea of when
this happens, we compare the energies of an optimized three-helix-bundle conformation
and an optimized rodlike conformation, for different $\kappa$. These conformations were
generated as follows.

Starting at $\kappa = 0$, we quenched a large number of low-temperature Monte Carlo
conformations to zero temperature, by using a conjugate-gradient method. The structure
with the lowest energy found is the BU structure in Figure 6. This structure is taken
as our three-helix-bundle conformation at $\kappa = 0$. Our rodlike $\kappa = 0$
conformation was also obtained by a conjugate-gradient minimization, starting from a
long "ideal" helix.

We then performed energy minimizations at successively lower $\kappa$, each time taking
the optimized conformations from the previous $\kappa$ as our two starting points. The
two sets of energies obtained this way are shown as functions of $\kappa$ in Figure 8.
We see that the curves cross at $\kappa \approx -0.15$. Although there may well exist
three-helix-bundle energies that are somewhat lower than those in Figure 8, these
results strongly suggest that the ground state turns into one long helix already at a
relatively small negative $\kappa$.
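Locating the crossing of two energy curves that are known only on a grid of $\kappa$ values can be done by linear interpolation of their difference. The sketch below shows one simple way to do this; it is not the authors' procedure, the function name is an assumption, and the input arrays are abstract placeholder curves rather than the energies of Figure 8.

```python
def find_crossing(xs, ya, yb):
    """Return the first x at which curves ya and yb cross.

    The curves are given on a common grid xs. The crossing is located by
    linear interpolation of the difference ya - yb between neighboring grid
    points. Returns None if the difference never changes sign.
    """
    diffs = [a - b for a, b in zip(ya, yb)]
    for i in range(len(xs) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 == 0.0:
            return xs[i]
        if d0 * d1 < 0.0:
            return xs[i] + (xs[i + 1] - xs[i]) * d0 / (d0 - d1)
    if diffs and diffs[-1] == 0.0:
        return xs[-1]
    return None

# Abstract placeholder curves (not data from the paper).
grid = [0.0, 1.0, 2.0]
curve_a = [0.0, 1.0, 2.0]
curve_b = [1.5, 1.5, 1.5]
x_cross = find_crossing(grid, curve_a, curve_b)
```

For the placeholder curves the difference changes sign between the second and third grid points, and the interpolated crossing lies halfway between them.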
4 Summary



The calculations discussed in this paper can be divided into two parts. First, we
showed that the three-helix-bundle protein, for a suitable choice
$(\tilde\epsilon_{\rm hb}, \tilde\epsilon_{\rm AA})$ of the parameters
$(\epsilon_{\rm hb}, \epsilon_{\rm AA})$, indeed has the properties listed in the
introduction. Let us stress that we find these properties without resorting to the Go
approximation. This is important as many current models rely on this approximation
[6],[17]-[21], based on the view that the folding properties are strongly influenced by
the native topology, whereas energetic frustration plays a less important role. The
results presented in this paper are consistent with this view, but it is clear that
further studies are needed in order to properly understand the consequences and
applicability of the Go approximation.

In the second part, we presented results obtained for
$(\epsilon_{\rm hb}, \epsilon_{\rm AA}) = (\tilde\epsilon_{\rm hb} - \kappa, \tilde\epsilon_{\rm AA} + \kappa)$
for different $\kappa$. Not unexpectedly, it turns out that the folding behavior
depends critically on the relative strength of the parameters $\epsilon_{\rm hb}$ and
$\epsilon_{\rm AA}$. In particular, we saw that a first-order-like collapse to a
three-helix-bundle state is observed only in a narrow window around $\kappa = 0$; a
proper balance between hydrogen bonds and hydrophobicity forces is required for the
chain to show this behavior.

The fact that the dependence on these parameters is strong may seem unwanted,
but is not physically unreasonable. In fact, the situation is somewhat reminiscent
of what has been found for homopolymers with stiffness [30]-[33], with the hydrogen
bonds playing the role of the stiffness term. Note also that the incorporation of full
side chains will make the chains intrinsically stiffer, which might lead to a weaker
dependence on the hydrogen-bond strength $\epsilon_{\rm hb}$.